
remove random functions in CMath #3906

Merged

Conversation

MikeLing
Contributor

@MikeLing MikeLing commented Jul 6, 2017

No description provided.

@@ -38,6 +37,7 @@ namespace shogun
SGIO* sg_io=NULL;
Version* sg_version=NULL;
CMath* sg_math=NULL;
__int32_t sg_random_seed = shogun::CRandom::generate_seed();
Contributor Author

mmm, the logic here is a bit odd. I define the global random seed in init.cpp because that is where we define global variables. Then I put the setter and getter functions in SGObject, because otherwise we would need to include shogun/base/init.h every time we want to call set_global_seed() or the getter.

Member

why __?

Member

the generate_seed function should no longer be part of CRandom...

Contributor Author

the generate_seed function should no longer be part of CRandom...

So we need to move it to CSGObject?

Member

no i would put it into init.cpp as a static function
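A minimal sketch of that suggestion (assuming std::random_device, which the thread later settles on; the variable name sg_random_seed is taken from the diff above, and the exact file layout is hypothetical):

```cpp
#include <cstdint>
#include <random>

namespace shogun
{
    // Sketch: a file-local seed generator living in init.cpp instead of
    // CRandom::generate_seed(), per the review suggestion.
    static uint32_t generate_seed()
    {
        // std::random_device yields a non-deterministic unsigned value
        return std::random_device()();
    }

    uint32_t sg_random_seed = generate_seed();
}
```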

#include <shogun/lib/Time.h>
#include <shogun/lib/Lock.h>
#include <shogun/mathematics/Math.h>
#include <shogun/mathematics/Random.h>
Contributor Author

Hi @vigsterkr, about using the global random seed in the default CRandom ctor: if I write it like https://gist.github.com/MikeLing/c5d618c176a14aec06f3411ce1b30c5e I get an "undefined symbols for architecture x86_64" C++ linker error. But everything works if I use CSGObject::get_global_seed() instead of sg_random_seed. I don't know why this happens :(

#include <shogun/mathematics/Math.h>
#include <gtest/gtest.h>
#include <shogun/mathematics/Random.cpp>
Member

why would you import Random.cpp here?!

Contributor Author

that's a mistake, let me remove it right away!

@@ -499,6 +501,16 @@ class CSGObject
*/
void set_seed(int32_t seed);
Member

this you should drop

@@ -499,6 +501,16 @@ class CSGObject
*/
void set_seed(int32_t seed);

static void set_global_seed(uint32_t seed)
Member

there's no need for this here... since your seed is a global thing,
just put set_global_seed and get_global_seed in init, as we do for other globals

@@ -38,6 +37,7 @@ namespace shogun
SGIO* sg_io=NULL;
Version* sg_version=NULL;
CMath* sg_math=NULL;
__int32_t sg_random_seed = shogun::CRandom::generate_seed();
Member

the generate_seed function should no longer be part of CRandom...

@@ -325,6 +325,7 @@ SGVector<int32_t> CStatistics::sample_indices(int32_t sample_size, int32_t N)
int32_t* idxs=SG_MALLOC(int32_t,N);
int32_t i, rnd;
int32_t* permuted_idxs=SG_MALLOC(int32_t,sample_size);
auto rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument as the default ctor uses global seed anyways right?

@@ -447,8 +448,12 @@ template <class T> class DynArray
/** randomizes the array (not thread safe!) */
void shuffle()
{
auto m_rng =
std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -131,7 +131,7 @@ SGVector<int32_t> CKMeansMiniBatch::mbchoose_rand(int32_t b, int32_t num)
{
SGVector<int32_t> chosen=SGVector<int32_t>(num);
SGVector<int32_t> ret=SGVector<int32_t>(b);
auto rng = std::unique_ptr<CRandom>(new CRandom());
auto rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -33,6 +33,7 @@ SGMatrix<float64_t> CDataGenerator::generate_checkboard_data(int32_t num_classes
int32_t dim, int32_t num_points, float64_t overlap)
{
int32_t points_per_class = num_points / num_classes;
auto m_rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -86,12 +88,13 @@ SGMatrix<float64_t> CDataGenerator::generate_mean_data(index_t m,
/* evtl. allocate space */
SGMatrix<float64_t> result=SGMatrix<float64_t>::get_allocated_matrix(
dim, 2*m, target);
auto m_rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -773,14 +775,15 @@ SGMatrix<float64_t> CStatistics::sample_from_gaussian(SGVector<float64_t> mean,

typedef SparseMatrix<float64_t> MatrixType;
const MatrixType &c=EigenSparseUtil<float64_t>::toEigenSparse(cov);
auto rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -16,6 +16,7 @@ SGMatrix<float64_t> CQDiag::diagonalize(SGNDArray<float64_t> C, SGMatrix<float64
int T = C.dims[2];

SGMatrix<float64_t> V;
auto rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

return patterns[i];
}
auto m_rng =
std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -512,13 +512,13 @@ void Solver_MCSVM_CS::solve()
}
state->inited = true;
}

auto m_rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@@ -269,19 +269,23 @@ CHMSVMModel* CTwoStateModel::simulate_data(int32_t num_exm, int32_t exm_len,
SGVector< int32_t > ll(num_exm*exm_len);
ll.zero();
int32_t rnb, rl, rp;

auto m_rng = std::unique_ptr<CRandom>(new CRandom(get_global_seed()));
Member

no need for the get_global_seed argument

@MikeLing MikeLing force-pushed the get_rid_of_sg_rand branch 2 times, most recently from 6061d2b to e4c4d96 Compare July 6, 2017 14:24
@MikeLing
Contributor Author

MikeLing commented Jul 7, 2017

So, in this PR we removed all static random generators and now only share a static seed between the different modules (because we want all modules to use a fixed seed for the unit tests). However, many tests (unit and meta tests) fail with these changes. Locally, I have:

The following tests FAILED:
1:26 PM 212 - unit-NeuralLinearLayer (Failed)
1:26 PM 246 - unit-QuadraticTimeMMD (Failed)
1:26 PM 247 - unit-TwoDistributionTest (Failed)
1:26 PM 278 - integration_meta_cpp-clustering-gmm (Failed)
1:26 PM 282 - integration_meta_cpp-distance-cosine (Failed)
1:26 PM 299 - integration_meta_cpp-multiclass_classifier-random_forest (Failed)
1:26 PM 303 - integration_meta_cpp-neural_nets-feedforward_net_classification (Failed)
1:26 PM 304 - integration_meta_cpp-neural_nets-feedforward_net_regression (Failed)

All of them rely on the global random generator, which was removed in this and the last PR. We wonder whether they fail as a side effect of the global random removal, in which case we could assume we haven't broken those modules by applying this PR.
Hi @karlnapf and @lambday , could you take a look at this PR? Regarding the global random seed and the way random generators are set and used, please tell me if there is anything you don't like, thanks a lot!

Member

@vigsterkr vigsterkr left a comment

do you agree that CRandom(sg_random_seed) and CRandom() will construct the same object with the same state? because it does...
so there's really no need for new CRandom(sg_random_seed); it's enough to call new CRandom(), right?
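For the standard engines the analogous claim is checkable directly: seeding via the constructor and default-constructing then calling seed() give identical state. A small illustration with std::mt19937 standing in for CRandom (this is not Shogun code):

```cpp
#include <random>

// True iff ctor-seeding and default-construct-then-seed() give the same
// engine state and the same output stream (the analogue of the
// CRandom(sg_random_seed) vs CRandom() question above).
inline bool same_state_after_seeding(unsigned seed)
{
    std::mt19937 a(seed); // seed passed to the ctor
    std::mt19937 b;       // default-constructed...
    b.seed(seed);         // ...then given the same seed
    return a == b && a() == b();
}
```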

@@ -325,6 +325,7 @@ SGVector<int32_t> CStatistics::sample_indices(int32_t sample_size, int32_t N)
int32_t* idxs=SG_MALLOC(int32_t,N);
int32_t i, rnd;
int32_t* permuted_idxs=SG_MALLOC(int32_t,sample_size);
auto rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

@MikeLing do you agree that CRandom(sg_random_seed) and CRandom() will construct the same object with the same state? because it does...
so there's really no need for new CRandom(sg_random_seed); it's enough to call new CRandom(), right?

@@ -711,12 +712,13 @@ SGMatrix<float64_t> CStatistics::sample_from_gaussian(SGVector<float64_t> mean,
int32_t dim=mean.vlen;
Map<VectorXd> mu(mean.vector, mean.vlen);
Map<MatrixXd> c(cov.matrix, cov.num_rows, cov.num_cols);
auto rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -773,14 +775,15 @@ SGMatrix<float64_t> CStatistics::sample_from_gaussian(SGVector<float64_t> mean,

typedef SparseMatrix<float64_t> MatrixType;
const MatrixType &c=EigenSparseUtil<float64_t>::toEigenSparse(cov);
auto rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -16,6 +16,7 @@ SGMatrix<float64_t> CQDiag::diagonalize(SGNDArray<float64_t> C, SGMatrix<float64
int T = C.dims[2];

SGMatrix<float64_t> V;
auto rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

return patterns[i];
}
auto m_rng =
std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -33,6 +33,7 @@ SGMatrix<float64_t> CDataGenerator::generate_checkboard_data(int32_t num_classes
int32_t dim, int32_t num_points, float64_t overlap)
{
int32_t points_per_class = num_points / num_classes;
auto m_rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -86,12 +88,13 @@ SGMatrix<float64_t> CDataGenerator::generate_mean_data(index_t m,
/* evtl. allocate space */
SGMatrix<float64_t> result=SGMatrix<float64_t>::get_allocated_matrix(
dim, 2*m, target);
auto m_rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -107,7 +110,7 @@ SGMatrix<float64_t> CDataGenerator::generate_sym_mix_gauss(index_t m,
/* evtl. allocate space */
SGMatrix<float64_t> result=SGMatrix<float64_t>::get_allocated_matrix(
2, m, target);

auto m_rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -614,8 +614,9 @@ void SGVector<float32_t>::vec1_plus_scalar_times_vec2(float32_t* vec1,
template <class T>
void SGVector<T>::random_vector(T* vec, int32_t len, T min_value, T max_value)
{
auto m_rng = std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

else
{
auto m_rng =
std::unique_ptr<CRandom>(new CRandom(sg_random_seed));
Member

new CRandom(sg_random_seed)) -> new CRandom()

@@ -223,4 +232,37 @@ namespace shogun
}
#endif
}

uint32_t generate_seed()
Member

we can actually replace this complicated function with a simple C++11 facility, std::random_device... basically:

uint32_t generate_seed()
{
    return std::random_device()();
}

@MikeLing MikeLing force-pushed the get_rid_of_sg_rand branch 2 times, most recently from d5cf7e6 to 48aa9bc Compare July 7, 2017 12:56
@vigsterkr
Member

@MikeLing i've rebased the feature/random-refactor over the current develop so you have the fix from @karlnapf for the global functions... you would need to clean up the branch a bit to be able to merge/test :)

@MikeLing MikeLing force-pushed the get_rid_of_sg_rand branch 2 times, most recently from 673d8e8 to 6920678 Compare July 12, 2017 09:59
@@ -483,7 +483,7 @@ def translateExpr(self, expr):
method = expr[key][0]["Identifier"]
argsList = None
try:
argsList = expr[key][2]
argsList = expr[key][1]
Contributor Author

Hi @karlnapf , I already fixed the missing parameter in the global function call here. Thank you for your help :)

Member

Well done!!! :)

@MikeLing
Contributor Author

Hi @karlnapf , so we still have these meta tests failing:

296 - integration_meta_cpp-neural_nets-feedforward_net_regression (Failed)
297 - integration_meta_cpp-neural_nets-feedforward_net_classification (Failed)
320 - integration_meta_cpp-multiclass_classifier-random_forest (Failed)
324 - integration_meta_cpp-clustering-gmm (Failed)

Here is the generated data of gmm : https://pastebin.mozilla.org/9026925
and here is the reference data of it: https://pastebin.mozilla.org/9026926

Can we assume all the differences are side effects of the global random removal and nothing we need to worry about?

@@ -287,7 +287,7 @@ TEST(KernelSelectionMaxCrossValidation, quadratic_time_single_kernel_dense)
mmd->set_train_test_mode(false);

auto selected_kernel=static_cast<CGaussianKernel*>(mmd->get_kernel());
EXPECT_NEAR(selected_kernel->get_width(), 0.0625, 1E-10);
EXPECT_NEAR(selected_kernel->get_width(), 0.03125, 1E-10);
Member

@karlnapf @lambday just need your ok here... i think it's fine; the diff is just due to the fact that the PRNG behaviour changed, hence even setting the same seed will not give you back the same series of random numbers....

@@ -354,7 +354,7 @@ TEST(QuadraticTimeMMD, perform_test_permutation_biased_full)
// assert against local machine computed result
mmd->set_statistic_type(ST_BIASED_FULL);
float64_t p_value=mmd->compute_p_value(mmd->compute_statistic());
EXPECT_NEAR(p_value, 0.0, 1E-10);
EXPECT_NEAR(p_value, 0.8, 1E-10);
Member

@karlnapf @lambday same as above, need your explicit OK though

Member

I don't think this change is OK.
0.0 essentially means that the test always works,
0.8 means it almost never works.
Why would that come from a change of seed?

Member

The same goes for all the others.

Member

probably not because of a change of seed then... hence my question here, as i didn't want to dig into what is actually happening in this unit test.
i.e. this part needs to be reverted, and we need to figure out where else the random handling is not done properly.

Member

If you re-run this a couple of times locally without a fixed seed, does the p-value fluctuate a lot?

Member

This is the job of an integration test, which we have. So I suggest just dropping these tests... Would you mind marking them, so @MikeLing can drop them?

Member

@karlnapf I think the following tests follow a similar pattern:

TEST(QuadraticTimeMMD, perform_test_permutation_biased_full)
TEST(QuadraticTimeMMD, perform_test_permutation_unbiased_full)
TEST(QuadraticTimeMMD, perform_test_permutation_unbiased_incomplete)
TEST(QuadraticTimeMMD, perform_test_spectrum)
TEST(LinearTimeMMD, perform_test_gaussian_biased_full)
TEST(LinearTimeMMD, perform_test_gaussian_unbiased_full)
TEST(LinearTimeMMD, perform_test_gaussian_unbiased_incomplete)
TEST(KernelSelectionMaxMMD, linear_time_single_kernel_streaming)
TEST(KernelSelectionMaxMMD, quadratic_time_single_kernel_dense)
TEST(KernelSelectionMaxMMD, linear_time_weighted_kernel_streaming)
TEST(KernelSelectionMaxTestPower, linear_time_single_kernel_streaming)
TEST(KernelSelectionMaxTestPower, quadratic_time_single_kernel)
TEST(KernelSelectionMaxTestPower, linear_time_weighted_kernel_streaming)
TEST(KernelSelectionMaxCrossValidation, quadratic_time_single_kernel_dense)
TEST(KernelSelectionMaxCrossValidation, linear_time_single_kernel_dense)
TEST(KernelSelectionMedianHeuristic, quadratic_time_single_kernel_dense)
TEST(KernelSelectionMedianHeuristic, linear_time_single_kernel_dense)

However, I think instead of dropping all of these, some could be rewritten with hard-coded data and asserted accordingly. Let me know what you think.

Member

Yes absolutely!
What about creating a fixture class with a very simple and small fixed dataset, no randomness.
And then compute all the statistics and assert the results. Then put a link to a python notebook in there for reference. I did something like that here: https://github.com/karlnapf/shogun/blob/feature/kernel_exp_family/tests/unit/distribution/kernel_exp_family/impl/Nystrom_unittest.cc
and here
https://github.com/karlnapf/shogun/blob/feature/kernel_exp_family/tests/unit/distribution/kernel_exp_family/impl/DataFixture.h

As for the permutation test p-values, I would unit test the mechanics independently of random data (if we have the statistics covered, we only need to test that the permutation test compares/computes the right things, i.e. test indices)

I think this would really benefit the maintainability of the framework!

Member

@karlnapf yeah unit-test fixture is a good idea. I agree, that for permutation tests we just need to check the indices. If the internal classes don't provide required APIs to access those things, we can use mock objects to allow access to the unit-tests.

Member

First patch: add meta example integration tests (minimal!) for all the unit test cases here. Then remove the unit tests.

@@ -393,7 +393,7 @@ TEST(QuadraticTimeMMD, perform_test_permutation_unbiased_full)
// assert against local machine computed result
mmd->set_statistic_type(ST_UNBIASED_FULL);
float64_t p_value=mmd->compute_p_value(mmd->compute_statistic());
EXPECT_NEAR(p_value, 0.0, 1E-10);
EXPECT_NEAR(p_value, 0.8, 1E-10);
Member

@karlnapf @lambday same as above, need your explicit OK though

@@ -432,7 +432,7 @@ TEST(QuadraticTimeMMD, perform_test_permutation_unbiased_incomplete)
// assert against local machine computed result
mmd->set_statistic_type(ST_UNBIASED_INCOMPLETE);
float64_t p_value=mmd->compute_p_value(mmd->compute_statistic());
EXPECT_NEAR(p_value, 0.0, 1E-10);
EXPECT_NEAR(p_value, 0.6, 1E-10);
Member

@karlnapf @lambday same as above, need your explicit OK though

@@ -475,15 +475,15 @@ TEST(QuadraticTimeMMD, perform_test_spectrum)
// assert against local machine computed result
mmd->set_statistic_type(ST_BIASED_FULL);
float64_t p_value_spectrum=mmd->compute_p_value(mmd->compute_statistic());
EXPECT_NEAR(p_value_spectrum, 0.0, 1E-10);
EXPECT_NEAR(p_value_spectrum, 0.8, 1E-10);
Member

@karlnapf @lambday same as above, need your explicit OK though


// unbiased case

// compute p-value using spectrum approximation for null distribution and
// assert against local machine computed result
mmd->set_statistic_type(ST_UNBIASED_FULL);
p_value_spectrum=mmd->compute_p_value(mmd->compute_statistic());
EXPECT_NEAR(p_value_spectrum, 0.0, 1E-10);
EXPECT_NEAR(p_value_spectrum, 0.8, 1E-10);
Member

@karlnapf @lambday same as above, need your explicit OK though

@karlnapf
Member

@MikeLing as for the

296 - integration_meta_cpp-neural_nets-feedforward_net_regression (Failed)
297 - integration_meta_cpp-neural_nets-feedforward_net_classification (Failed)
320 - integration_meta_cpp-multiclass_classifier-random_forest (Failed)
324 - integration_meta_cpp-clustering-gmm (Failed)

I think all of those contain random elements, so the integration tests indeed need to be updated if you change the seed.

@MikeLing
Contributor Author

Hi @vigsterkr and @karlnapf , I changed these tests' expected output based on Linux (both gcc and clang), but Windows' random implementation looks different :(

@karlnapf
Member

The most annoying solution I can think of is using #ifdef to account for both systems ...

@vigsterkr
Member

ok so i mean we are not thaaaaaaaaaaat bad (although of course not that good either). so basically i've just tested, and as specified in the standard, the random engines (mt19937, mt19937_64, minstd_rand etc.) generate the same values in the same order for the same seed :)

of course it's annoying that we'll now have to provide our own distribution mappers (uniform, normal etc.), but we can do that - it would have been nice to use the std api all the way - and then have that as our random. this way we can still throw out the external random generators that i pulled in a couple of years ago, and we are still good on removing the global random :)

@MikeLing
Contributor Author

so basically i've just tested, and as specified in the standard, the random engines (mt19937, mt19937_64, minstd_rand etc.) generate the same values in the same order for the same seed :)

I just tested it on Linux (gcc, run in CLion) and Windows (Visual Studio), and the results are totally different https://gist.github.com/MikeLing/12cf9254248e9ce49d343f297e920e1e#gistcomment-2154781

@vigsterkr
Member

@MikeLing :) cool console!
but what i have tested was this:

    std::mt19937_64 rng(100);
    for (int32_t i = 0; i < 15; i++)
    {
        std::cout << rng() << std::endl;
    }

@vigsterkr
Member

vigsterkr commented Jul 22, 2017

based on the standard this should be the same regardless of the compiler/os/etc.
these random engines all generate uniformly distributed integers. now, the way you map them to a given interval of uniformly distributed elements, or to other distributions, is actually implementation-specific. but that part we can provide ourselves :) see in CRandom how we use a PRNG that generates uniformly distributed randoms to get samples of a normal distribution with a given mean and variance.
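That mapping idea can be sketched with a Box-Muller transform over the raw engine output (an illustration only; it is not necessarily the method CRandom uses internally):

```cpp
#include <cmath>
#include <random>

// Portable normal sampler built only on the raw engine output, so results
// depend on the standard-specified engine rather than on the
// implementation-defined std::normal_distribution.
inline double sample_normal(std::mt19937_64& rng, double mean, double stddev)
{
    const double pi = 3.14159265358979323846;
    // Map raw 64-bit outputs into the open interval (0, 1)
    const double denom = static_cast<double>(rng.max()) + 1.0;
    const double u1 = (static_cast<double>(rng()) + 0.5) / denom;
    const double u2 = (static_cast<double>(rng()) + 0.5) / denom;
    // Box-Muller: two uniforms -> one standard normal deviate
    const double z = std::sqrt(-2.0 * std::log(u1)) * std::cos(2.0 * pi * u2);
    return mean + stddev * z;
}
```

Since only engine calls are involved, a fixed seed yields the same samples on every conforming platform.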

@vigsterkr
Member

so basically the idea is that we can throw out src/shogun/lib/external/SFMT and src/shogun/lib/external/dSFMT/ and use c++'s own random engines, but provide our own implementations of std::uniform_int_distribution and std::normal_distribution etc.

it's a bit unfortunate, as it would have been great to use the standard all the way, but.... :) on the other hand, i'm just saying that we only require these distributions to produce the very same values in the very same order because of our tests, right? conceptually the model itself is supposed to work correctly as long as it gets samples of the distribution it expects (regardless of the actual values of those samples... that's why it uses a prng and not a fixed set of integers in the first place)...
it should not require specific values, just that the samples are from the right distribution...

it's only ourselves who want to be able to test that the model produces the very same thing with the same samples (when setting the seed)... the only thing our test fixtures capture is a given sample-set of a distribution on a given OS/compiler...
in other words, if we could address the fact that the sample-set for a fixed seed differs by compiler, then we actually could use the standard all the way. i think it would be a rather big sacrifice to say that just because of our test methodology we want to throw out half of c++11's random features... or?
what do you think @karlnapf ?
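One way to realize the "own distribution mappers" part portably is unbiased rejection sampling over the raw engine output; a hypothetical sketch (the function name uniform_int is invented here, and this is not Shogun's actual implementation):

```cpp
#include <cstdint>
#include <random>

// Portable stand-in for std::uniform_int_distribution: maps raw 64-bit
// engine output into [lo, hi] without modulo bias, so a fixed seed gives
// identical values on every conforming platform.
inline uint64_t uniform_int(std::mt19937_64& rng, uint64_t lo, uint64_t hi)
{
    const uint64_t range = hi - lo + 1; // assumes lo <= hi and range < 2^64
    // Reject raw values at or above the largest multiple of `range`,
    // otherwise the final modulo would slightly favour small results.
    const uint64_t limit = UINT64_MAX - UINT64_MAX % range;
    uint64_t x;
    do
    {
        x = rng();
    } while (x >= limit);
    return lo + x % range;
}
```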

@vigsterkr
Member

i've just realised: say we use c++11's random features all the way in the implementation and we somehow solve the testing problem. one could of course still argue that we would then end up with different models for the same seed depending on the compiler/OS, or? the question is whether those models would be sooooo different? on the other hand, i think it's a fair presumption from the user that, regardless of compiler/system etc., for the same (data, prng seed) pair you want to end up with the same model, or?

@MikeLing
Contributor Author

Hi @vigsterkr , I think the root of this problem is the use of randomly generated data in our unit tests. Could we manually create some fixed mock data and use it instead of data generators like CMeanShiftDataGenerator and others?

@MikeLing
Contributor Author

I mean, it's a feature branch anyway, so we could use the results of one platform, like travis-ci, to verify our refactor, and add fixtures or other things to make the tests work on different systems before merging it?

@vigsterkr
Member

vigsterkr commented Jul 22, 2017

@MikeLing mmmm not really... actually using CMeanShift and the other generators would in a way be a better solution than using mock data... with mock data you would either need a set for each and every platform/compiler, whereas with a random generator you can rely on the fact that it is consistent within a platform :)

btw: have you checked whether the function i pasted above generates the same set and order of elements on windows? on osx and linux it's:

4282876139
3093770124
4005303368
491263
550290313
1298508491
4290846341
630311759
1013994432
396591248
1703301249
799981516
1666063943
1484172013

@MikeLing
Contributor Author

MikeLing commented Jul 22, 2017

@vigsterkr Hi, sorry for the late reply. Here is the output I got on Windows (Visual Studio 2015, Windows 10); you can see it differs hugely from the Linux and OSX results

17739577640593830518
5969071622678286091
17952737041105130042
2376730765988821965
16631841681740081638
9150857099769749596
17628273237363548829
12758643431150887824
15745505721093787712
16772453002033882614
18026061444052040848
3392054037201815393
9583663648522241488
8357967283887551697
10686319701058721126

update: Oh! I got the same output on Linux (Ubuntu 17.10, CLion with gcc) :)

@karlnapf
Member

I agree Viktor, not using half of C++11 because of cross-platform testing is not a good idea.

I actually have an idea: what if we create a fake drop-in replacement for random (and all the distributions) that behaves deterministically? This would be purely for unit testing (not statistical testing), so it doesn't matter that the numbers are not actually "correct"; it is just to ensure that the random elements of algorithms (random forest, for example) are fixed across platforms. Even if you fix the datasets, the algorithms might still behave differently otherwise.

Then we can have statistics tests (algorithm output "makes sense") which do not rely on fixed seeds
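A minimal sketch of such a fake (the name FakeRng and the counting behaviour are invented for illustration): it satisfies the C++11 UniformRandomBitGenerator requirements, so anything templated on the engine type accepts it, but its output is just a counter, making every "random" decision bit-for-bit reproducible across platforms:

```cpp
#include <cstdint>

// Deterministic stand-in for a random engine, for unit tests only: meets
// the UniformRandomBitGenerator requirements (result_type, min, max,
// operator()) but simply counts upward from the seed.
struct FakeRng
{
    using result_type = std::uint64_t;

    explicit FakeRng(result_type seed = 0) : state(seed) {}

    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return UINT64_MAX; }

    result_type operator()() { return state++; } // "randomness" is a counter

    result_type state;
};
```

The statistical quality is irrelevant here; such tests only check that the surrounding mechanics are deterministic, while statistical correctness lives in separate tests without fixed seeds.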

@vigsterkr
Member

@MikeLing ok wait, this is weird, as they not only have different values but a different number of digits as well... we should check on this, because based on the c++11 standard the random engines like std::mt19937_64 should actually behave the same.

@vigsterkr
Member

okay! it's me who has been the idiot here! the output that i pasted was for std::mt19937 rng(100); :D when i tested with std::mt19937_64 rng(100); i got the same as you @MikeLing

17739577640593830518
5969071622678286091
17952737041105130042
2376730765988821965
16631841681740081638
9150857099769749596
17628273237363548829
12758643431150887824
15745505721093787712
16772453002033882614
18026061444052040848
3392054037201815393
9583663648522241488
8357967283887551697
10686319701058721126

@MikeLing
Contributor Author

What if we create a fake drop-in replacement for random (and all the distributions) that behaves deterministically.

hi @karlnapf , could you tell me more about the replacement? Do you mean something like replacing every number greater than 10 with 5, and every number smaller than 1 with 0?

@vigsterkr
Member

@MikeLing are we done porting everything to std::uniform_int_distribution and std::normal_distribution? And is it only the unit tests that are failing now?

@MikeLing
Contributor Author

MikeLing commented Jul 24, 2017

Hi @vigsterkr , sorry for the late reply.

mmm, if you mean whether we can apply the C++11 features everywhere and still get the 'expected' results in the unit tests, I would say not really: the NeuralNetwork test fails in a different way, and it looks like a fundamental error, not a side effect.

Basically speaking, the NeuralNetwork doesn't work at all: https://pastebin.mozilla.org/9027272. I did some research on it and found the error comes from this line. Normally, it should change m_params like this:

// Before m_params =[-0.0272588678038886943,-0.115811873679361951,0.0545971914175761991,-0.0170063438930531073,-0.00733285826272410055,0.275457538264036739,-0.140482850977850904,-0.0747966925045189662,-0.173274251285870623]

result = train_lbfgs(inputs, targets);

// After m_params=[71.3832249892633399,-922.939443864746295,247.116815510193106,564.02772070416006,282.927469091245086,650.096198731005529,-185.827272443788843,343.714997962353948,-1670.88749255393668]

But right now nothing changes after calling train_lbfgs:

// Before m_params =[-0.0272588678038886943,-0.115811873679361951,0.0545971914175761991,-0.0170063438930531073,-0.00733285826272410055,0.275457538264036739,-0.140482850977850904,-0.0747966925045189662,-0.173274251285870623]

result = train_lbfgs(inputs, targets);

// After m_params=[-0.0272588678038886943,-0.115811873679361951,0.0545971914175761991,-0.0170063438930531073,-0.00733285826272410055,0.275457538264036739,-0.140482850977850904,-0.0747966925045189662,-0.173274251285870623]

Furthermore, I found it is because, here, the rate is somehow smaller than param.delta, which forces the loop to break earlier than usual. Let me show it:

Here is the result after the refactor:

) at lbfgs.cpp:482
479
480 /* The stopping criterion. */
481 if (rate < param.delta) {
-> 482 ret = LBFGS_STOP;
483 break;
484 }
485 }
(lldb) p rate
(float64_t) $19 = 0.00000000037904292365490482
(lldb) p param.delta
(float64_t) $20 = 0.00000001

It then jumps out and returns. But before the change, in the same situation, it was:
(lldb) p rate
(float64_t) $18 = 0.0000014666543770853346
(lldb) p param.delta
(float64_t) $19 = 0.00000001

I don't know why we get this, since all I changed is the layer initialization to use a different random generator. We can discuss it after I push to the remote. Anyway, lbfgs.cpp has too many pointers and C-style constructs like goto; sometimes I can't even tell where the next jump goes. Maybe we want a refactor for that?

So far, that is the only module broken by the move to C++11.

@vigsterkr
Member

@MikeLing have you tried changing the seed of the prng to something other than the current one?
but then again, this seems to be really unrelated to the prng itself; there's something else going on...

ok, so now we have everything using C++11 random, right?

based on travis, apart from the neural net errors, the following tests fail:

	319 - integration_meta_cpp-multiclass_classifier-random_forest (Failed)
	323 - integration_meta_cpp-clustering-gmm (Failed)

i suppose these differences/errors come from differences in the random implementations? have you debugged this, @MikeLing ?

@MikeLing
Contributor Author

MikeLing commented Jul 25, 2017

@vigsterkr , actually if we update the reference data as in shogun-toolbox/shogun-data@master...MikeLing:get_rid_prng#diff-584ce98839f5d8d8329b0222352d2a77L5, all the integration tests pass, on osx at least.

@vigsterkr
Member

@MikeLing is there a chance that you could test them on a linux machine using gcc? like docker on osx?

@MikeLing
Contributor Author

@vigsterkr ok, so I get different values on osx and Linux (gcc 6.3.0) for the integration tests :(

Here is the result on Linux:

<<_SHOGUN_SERIALIZABLE_ASCII_FILE_V_00_>>
array Vector<SGSerializable*> 7 ({WrappedBasic int32 [
value int32 3
]}{WrappedSGVector float64 [
value SGVector<float64> 2 ({2.877781040255349}{2.882625849371339})
]}{WrappedBasic int32 [
value int32 1
]}{WrappedSGVector float64 [
value SGVector<float64> 2 ({-1.904787755128408}{1.769808966893191})
]}{WrappedSGMatrix float64 [
value SGMatrix<float64> 2 2 ({0.6996486143045799}{-0.18578300800449}{-0.1857830080044899}{1.178257781238976})
]}{WrappedSGVector float64 [
value SGVector<float64> 3 ({0.835626159177427}{0.1062746918828992}{0.05809914893967373})
]}{WrappedSGVector float64 [
value SGVector<float64> 4 ({-16.72456858701353}{-3.961649752899388}{-45.69467288879569}{-3.961646887836526})
]})
resize_granularity int32 128
use_sg_malloc bool t
free_array bool t
dim1_size int32 1
dim2_size int32 1
dim3_size int32 1

But on osx it's:

<<_SHOGUN_SERIALIZABLE_ASCII_FILE_V_00_>>
array Vector<SGSerializable*> 7 ({WrappedBasic int32 [
value int32 3
]}{WrappedSGVector float64 [
value SGVector<float64> 2 ({-1.340404475033902}{-2.800506224466978})
]}{WrappedBasic int32 [
value int32 1
]}{WrappedSGVector float64 [
value SGVector<float64> 2 ({2.061880723357738}{-1.434886845371293})
]}{WrappedSGMatrix float64 [
value SGMatrix<float64> 2 2 ({0.66907111151842}{-0.0556426383556634}{-0.0556426383556634}{1.165198065187053})
]}{WrappedSGVector float64 [
value SGVector<float64> 3 ({0.4656181269875526}{0.2746846301617909}{0.2596972428506567})
]}{WrappedSGVector float64 [
value SGVector<float64> 4 ({-15.69060187574286}{-3.003532130996212}{-24.52781810983947}{-3.003529039718564})
]})
resize_granularity int32 128
use_sg_malloc bool t
free_array bool t
dim1_size int32 1
dim2_size int32 1
dim3_size int32 1

Member

@vigsterkr vigsterkr left a comment

some changes required

@@ -42,7 +43,7 @@ CGMM::CGMM(int32_t n, ECovType cov_type) : CDistribution(), m_components(), m_co
SG_REF(m_components[i]);
m_components[i]->set_cov_type(cov_type);
}

m_rng = get_prng<std::mt19937_64>();
Member

we should fully remove this right?

@@ -91,7 +92,7 @@ CGMM::CGMM(vector<CGaussian*> components, SGVector<float64_t> coefficients, bool
m_coefficients[i]=coefficients[i];
}
}

m_rng = get_prng<std::mt19937_64>();
Member

does GMM really need an object level random?

float64_t rand_num=CMath::random(float64_t(0), float64_t(1));

std::normal_distribution<float64_t> normal_dist(0, 1);
float64_t rand_num = normal_dist(m_rng);
Member

do we really wanna use here an object level PRNG?

@@ -247,6 +248,8 @@ class CGMM : public CDistribution
std::vector<CGaussian*> m_components;
/** Mixture coefficients */
SGVector<float64_t> m_coefficients;

std::mt19937_64 m_rng;
Member

this is required the way we do GMM?

@@ -241,6 +241,8 @@ class CGaussian : public CDistribution
SGVector<float64_t> m_mean;
/** covariance type */
ECovType m_cov_type;

std::mt19937_64 m_rng;
Member

this is because we need an object level rng?

@@ -167,6 +168,7 @@ SGVector<float64_t> CGaussianDistribution::log_pdf_multiple(SGMatrix<float64_t>

void CGaussianDistribution::init()
{
m_rng = get_prng();
Member

this should be explicit about which PRNG engine we want to use...

m_rng = get_prng<mt19937_64>();

@@ -73,6 +73,7 @@ void CGaussianBlobsDataGenerator::init()
m_stretch=1;
m_angle=0;
m_cholesky=SGMatrix<float64_t>(2, 2);
m_rng = get_prng();
Member

explicitly specify the prng std::mt19937_64

@@ -50,10 +50,10 @@ void CMeanShiftDataGenerator::init()
SG_ADD(&m_dimension_shift, "m_dimension_shift", "Dimension of mean shift",
MS_NOT_AVAILABLE);

m_rng = get_prng();
Member

explicitly define the required prng std::mt19937_64

*/
template <class T>
static void permute(SGVector<T> v, CRandom* rand=NULL)
template <
Member

this function should not be in math.... maybe we keep CRandom for that...

Contributor Author

Hi @vigsterkr , I somehow missed this comment. Actually, this function was part of CMath before the refactor; I just moved its implementation to the .h because it needs a template parameter. So do we need to move it to CRandom and keep that class around?

@@ -619,6 +623,7 @@ void CRBM::init()
m_visible_state_offsets = new CDynamicArray<int32_t>();
m_num_params = 0;
m_batch_size = 0;
m_rng = get_prng();
Member

be explicit of the PRNG type

@vigsterkr
Member

@MikeLing ok, so those errors are due to the differences we discovered between implementations of the sampling functions (std::uniform_int_distribution, std::normal_distribution, etc.), so now basically we should come up with an integration-test strategy to overcome this.

@MikeLing
Contributor Author

Hi @vigsterkr , those sample functions were defined as virtual or pure virtual (like CTraceSampler::sample(index_t)) in the first place, so I can't add template parameters to them directly, unless we use something like type erasure (http://www.artima.com/cppsource/type_erasure.html) to let the compiler know the template type up front. So I still use m_rng, but only define and use it inside the sample functions.

Does that make sense to you? :)

@vigsterkr
Member

@MikeLing i'm not so sure if i understand ;)

@MikeLing
Contributor Author

Hi @vigsterkr , sorry for the late reply. The problem with adding a template parameter to the sample functions splits into two parts:

  1. the TraceSampler hierarchy:

CNormalSampler and CProbingSampler inherit from CTraceSampler, which is a pure virtual class. So if we add a template parameter to sample, the compiler complains like https://pastebin.mozilla.org/9027994 (a member function template cannot be virtual). And we can't remove the virtual declaration, since CTraceSampler is a pure virtual class after all: https://pastebin.mozilla.org/9027995

  2. the Gaussian and other distributions:

The compiler likes the template parameter, but our meta tests see it differently. When we use it like gmm->sample(prng) in a meta example, we need to call get_prng() and pass the result to the sample function. But we can't, since get_prng hasn't been wrapped in SWIG, and I'm not sure the prng object can be used from Python and the other languages' meta tests. And we get this error (https://pastebin.mozilla.org/9027997) if we manually add a prng in the cpp meta test like:

auto prng = get_prng();
auto output = gmm->sample(prng);

@vigsterkr
Member

ok so since this goes into a feature branch i'm merging this and rebasing it... then i'll try to come up with a solution for the fixtures

@vigsterkr vigsterkr merged commit ef16d24 into shogun-toolbox:feature/random-refactor Aug 9, 2017